Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model
Authors
Abstract
Studies on grammatical error correction (GEC) have reported the effectiveness of pretraining a Seq2Seq model with a large amount of pseudodata. However, this approach requires time-consuming pretraining for GEC because of the size of the pseudodata. In this study, we explore the utility of bidirectional and auto-regressive transformers (BART) as a generic pretrained encoder-decoder model for GEC. With the use of this generic pretrained model for GEC, the time-consuming pretraining can be eliminated. We find that monolingual and multilingual BART models achieve high performance in GEC, with one of the results being comparable to the current strong results in English GEC. Our implementations are publicly available at GitHub (https://github.com/Katsumata420/generic-pretrained-GEC).
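The abstract treats GEC as plain sequence-to-sequence rewriting: a pretrained encoder-decoder model is fine-tuned directly on (erroneous, corrected) sentence pairs, with no task-specific pseudodata pretraining. A minimal sketch of how such fine-tuning pairs are formed, with toy data and hypothetical function names (not from the paper's implementation):

```python
# Sketch: GEC framed as seq2seq rewriting. An erroneous sentence is the
# encoder input (source); its corrected form is the decoder target.
# The corpus below is a toy illustration, not the paper's training data.

def make_seq2seq_pairs(parallel_corpus):
    """Turn (erroneous, corrected) sentence pairs into source/target
    examples suitable for fine-tuning a pretrained encoder-decoder."""
    return [{"source": src, "target": tgt} for src, tgt in parallel_corpus]

corpus = [
    ("She go to school every day.", "She goes to school every day."),
    ("I have much friends.", "I have many friends."),
]

pairs = make_seq2seq_pairs(corpus)
print(pairs[0]["target"])  # -> She goes to school every day.
```

In the BART setting, these pairs would then be fed to a standard seq2seq fine-tuning loop (cross-entropy loss on the target tokens), which is what lets the costly pseudodata pretraining step be dropped.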
Similar resources
A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction
We improve automatic correction of grammatical, orthographic, and collocation errors in text using a multilayer convolutional encoder-decoder neural network. The network is initialized with embeddings that make use of character Ngram information to better suit this task. When evaluated on common benchmark test data sets (CoNLL-2014 and JFLEG), our model substantially outperforms all prior neura...
A Beam-Search Decoder for Grammatical Error Correction
We present a novel beam-search decoder for grammatical error correction. The decoder iteratively generates new hypothesis corrections from current hypotheses and scores them based on features of grammatical correctness and fluency. These features include scores from discriminative classifiers for specific error categories, such as articles and prepositions. Unlike all previous approaches, our m...
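The abstract above describes iterative beam search over correction hypotheses: each step proposes new edited hypotheses from the current beam and keeps the highest-scoring ones. A minimal generic sketch of that loop, where the candidate generator and scorer are toy stand-ins for the paper's feature-based classifiers:

```python
# Minimal beam-search sketch: iteratively expand hypotheses and retain
# the top `beam_width` by score. `expand` and `score` below are toy
# illustrations, not the paper's correction proposers or feature models.

def beam_search(initial, expand, score, beam_width=2, steps=2):
    """Return the best hypothesis after `steps` rounds of expansion."""
    beam = [initial]
    for _ in range(steps):
        candidates = set()
        for hyp in beam:
            candidates.add(hyp)             # keep the unedited hypothesis
            candidates.update(expand(hyp))  # plus proposed corrections
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam[0]

# Toy example: propose a subject-verb agreement fix and prefer it.
def expand(sentence):
    return [sentence.replace(" go ", " goes ")]

def score(sentence):
    return 1.0 if " goes " in sentence else 0.0

print(beam_search("She go to school.", expand, score))
# -> She goes to school.
```

The key design point mirrored here is that search and scoring are decoupled: any set of correction proposers and any scoring function can be plugged in without changing the loop.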
A Hybrid Model For Grammatical Error Correction
This paper presents a hybrid model for the CoNLL-2013 shared task, which focuses on the problem of grammatical error correction. This year's task includes determiner, preposition, noun number, verb form, and subject-verb agreement errors, which is more comprehensive than previous error correction tasks. We correct these five types of errors in different modules where either machine learning based...
Generating a Training Corpus for OCR Post-Correction Using Encoder-Decoder Model
In this paper we present a novel approach to the automatic correction of OCR-induced orthographic errors in a given text. While current systems depend heavily on large training corpora or external information, such as domain-specific lexicons or confidence scores from the OCR process, our system only requires a small amount of relatively clean training data from a representative corpus to learn...
Simplification of the encoder-decoder circuit for a perfect five-qubit error correction
Simpler networks of encoding and decoding are necessary for more reliable quantum error correcting codes (QECCs). The simplification of the encoder-decoder circuit for a perfect five-qubit QECC can be derived analytically if the QECC is converted from its equivalent one-way entanglement purification protocol (1-EPP). In this work, the analytical method to simplify the encoder-decoder circuit is...
Journal
Journal title: Shizen gengo shori
Year: 2021
ISSN: 1340-7619, 2185-8314
DOI: https://doi.org/10.5715/jnlp.28.276